Finding Weak Motifs in DNA Sequences

Authors

  • Sing-Hoi Sze
  • Mikhail S. Gelfand
  • Pavel A. Pevzner
Abstract

Recognition of regulatory sites in unaligned DNA sequences is an old and well-studied problem in computational molecular biology. Recently, large-scale expression studies and comparative genomics brought this problem into a spotlight by generating a large number of samples with unknown regulatory signals. Here we develop algorithms for recognition of signals in corrupted samples (where only a fraction of sequences contain sites) with biased nucleotide composition. We further benchmark these and other algorithms on several bacterial and archaeal sites in a setting specifically designed to imitate the situations arising in comparative genomics studies.

Similar resources

Functional motifs in Escherichia coli NC101

Escherichia coli (E. coli) bacteria can damage the DNA of gut lining cells and, according to recent reports, may encourage the development of colon cancer. Genetic switches are specific sequence motifs, and many of them are drug targets. It is therefore of interest to identify motifs and their locations in sequences. In the present study, the Gibbs sampler algorithm was used to predict and find functional m...

Graphical approach to weak motif recognition.

We address the weak motif recognition problem in DNA sequences, which extends general motif recognition to more difficult cases by allowing more degeneration in motif instances. Several earlier algorithms have attempted to find weak motifs in DNA sequences, but with limitations. In this paper, we propose a graph-based algorithm for weak motif detection, which uses dynamic programming approach...

Apples to apples: improving the performance of motif finders and their significance analysis in the Twilight Zone

MOTIVATION: Effective algorithms for finding relatively weak motifs are an important practical necessity while scanning long DNA sequences for regulatory elements. The success of such an algorithm hinges on the ability of its scoring function combined with a significance analysis test to discern real motifs from random noise. RESULTS: In the first half of the paper we show that the paradigm of ...

Gamot: an Efficient Genetic Algorithm for Finding Challenging Motifs in DNA Sequences

Weak signals that mark transcription factor binding sites involved in gene regulation are considered to be challenging motifs. Identifying these motifs in unaligned DNA sequences is a computationally hard problem that requires efficient algorithms. Genetic Algorithms (GA), inspired by evolution in nature, are a class of stochastic search algorithms which have been applied successfully to man...

Molecular and Bioinformatics Analysis of Allelic Diversity in IGFBP2 Gene Promoter in Indigenous Makuee and Lori-Bakhtiari Sheep Breeds

The aim of this study was to perform molecular and bioinformatics analysis of the IGFBP2 gene promoter in association with some economic traits in indigenous Makuee (MS) and Lori-Bakhtiari (LB) breeds. DNA was extracted from blood samples of 120 MS and 200 LB sheep, and a 297 bp fragment from the upstream sequences of the studied gene was amplified and genotyped by single-strand conformational polymo...

Journal:
  • Pacific Symposium on Biocomputing

Publication date: 2002